Recommendations for Qualitative Ontology Matching Evaluations

Authors

  • Aliaksandr Autayeu
  • Vincenzo Maltese
  • Pierre Andrews
Abstract

This paper suggests appropriate rules for setting up ontology matching evaluations and for golden standard construction and use, which can significantly improve the quality of the precision and recall measures. We focus on the problem of evaluating ontology matching techniques [1] that find mappings with equivalence, less general, more general and disjointness relations, and on how to make the evaluation results fairer and more accurate. The literature discusses the appropriateness and quality of the measures [2], but says little about evaluation methodology [3]. Closest to our work, [4] raises the issue of evaluating non-equivalence links.

Golden standards (GS) are fundamental for evaluating precision and recall [2]. Typically, hand-made positive (GS) and negative (GS−) golden standards contain links considered correct and incorrect, respectively. Ideally, GS− complements GS, leading to a precise evaluation. Yet, in big datasets annotating all links is impractical, and golden standards are often a sample of all node pairs, leading to approximate evaluations [5]. However, most current evaluation campaigns tend to use tiny ontologies, risking biased or poorly significant results.

Recommendation 1. Use large golden standards. Include GS− for a good approximation of precision and recall. To be statistically significant, cover in GS and GS− an adequate portion of all node pairs. In a sampled GS, the reliability of the results depends on: (a) the portion of the pairs covered; (b) the ratio between GS and GS− sizes; and (c) their quality (see the last recommendation).

Most matching tools produce equivalence; some also produce less general and more general relations, but few output disjointness [6]. This must be taken into account to compare evaluations correctly. Usually, only the presence of a relation is evaluated, regardless of its kind. Moreover, disjointness (two completely unrelated nodes) is often confused with overlap (two nodes whose intersection is not empty), and both are put in the GS− [5]. This leads to imprecise results.

Recommendation 2. When presenting evaluation results, specify whether and how the evaluation takes the kind of semantic relations into account.

We use the notion of redundancy [7] to judge the quality of a golden standard. We use the Min(mapping) function to remove redundant links (producing the minimized mapping) and the Max(mapping) function to add all redundant links (producing the maximized mapping). Following [7] and staying within lightweight ontologies [8] guarantees that the maximized set is always finite, and thus precision and recall can always be computed. The table below presents the measures obtained in our experiments with SMatch on three different datasets (see [6] for details). Comparing the measures obtained with the maximized versions (max) to those obtained with the original versions (res), one can notice that the performance of the algorithm is on average better than expected. In [6] we explain why comparing the minimized versions is not meaningful, and we conclude that:

Recommendation 3. To obtain accurate measures it is fundamental to maximize both the golden standard and the matching result.

    Dataset pair    Precision, %            Recall, %
                    min     res     max     min     res     max
    101/304         32.47    9.75   69.67   86.21   93.10   92.79
    Topia/Icon      16.87    4.86   45.42   10.73   20.00   42.11
    Source/Target   74.88   52.03   48.40   10.35   40.74   53.30

Maximizing a golden standard can also reveal unexpected problems and inconsistencies. For instance, we discovered that in TaxME2 [5] |GS ∩ GS−| = 2, while |Max(GS) ∩ Max(GS−)| = 2187.
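To make Recommendation 3 concrete, here is a minimal Python sketch of the maximize-then-evaluate procedure. It is not the authors' implementation: the toy hierarchies, the child-to-parent maps, and the helper names (ancestors, maximize, precision_recall) are illustrative assumptions, and this Max() only propagates links upward along the hierarchies, a subset of the full closure defined in [7].

    # A simplified sketch (not the paper's implementation) of Recommendation 3:
    # maximize both the golden standard and the matching result before
    # computing precision and recall. Mappings are sets of triples
    # (source_node, relation, target_node) with relations '=' (equivalence),
    # '<' (less general) and '>' (more general). Each ontology is a tree
    # given as a child -> parent map.

    def ancestors(node, parent):
        """All ancestors of a node in a child -> parent map."""
        result = []
        while node in parent:
            node = parent[node]
            result.append(node)
        return result

    def maximize(mapping, src_parent, tgt_parent):
        """Add links derivable from the mapping (simplified Max)."""
        maximized = set(mapping)
        for a, rel, b in mapping:
            if rel in ('=', '<'):
                # a is at most as general as b, hence less general
                # than every ancestor of b
                for b2 in ancestors(b, tgt_parent):
                    maximized.add((a, '<', b2))
            if rel in ('=', '>'):
                # a is at least as general as b, hence every ancestor
                # of a is more general than b
                for a2 in ancestors(a, src_parent):
                    maximized.add((a2, '>', b))
        return maximized

    def precision_recall(result, gs_pos, gs_neg):
        """Estimate precision and recall against sampled golden standards.
        With a sampled GS, only links annotated in GS or GS- can be
        judged, so precision is computed over those links only."""
        tp = len(result & gs_pos)
        fp = len(result & gs_neg)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / len(gs_pos) if gs_pos else 0.0
        return precision, recall

    # Toy example: the matcher returns a link that is redundant with
    # respect to the golden standard, but still correct.
    tgt_parent = {'dog': 'animal', 'cat': 'animal'}
    src_parent = {'poodle': 'canine'}
    gs = {('poodle', '<', 'dog')}
    res = {('poodle', '<', 'animal')}

    print(precision_recall(res, gs, set()))  # (0.0, 0.0)
    print(precision_recall(maximize(res, src_parent, tgt_parent),
                           maximize(gs, src_parent, tgt_parent), set()))
    # (1.0, 0.5): the redundant but correct link is now credited

On this toy input the unmaximized result shares no link with the GS, giving zero precision, while after maximizing both sets the redundant but correct link "poodle < animal" is credited, mirroring the better-than-expected effect visible in the table above.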
In future work we will explore how the size of the golden standard influences the evaluation and how large the portion covered by GS and GS− should be, as well as describe a methodology for evaluating rich mappings, supporting our recommendations with experimental results.



Publication date: 2009